Young People’s Mental Health Changes, Risk, and Resilience During the COVID-19 Pandemic

This cohort study explores the outcomes of the COVID-19 pandemic on secondary school students’ mental health difficulties, as well as the associations with individual, family, friendship, and school characteristics.


April
Non-essential retail, hairdressers, public buildings (e.g., libraries, museums) reopen. Outdoor venues, including pubs, restaurants, zoos and theme parks also open, as well as indoor leisure (e.g., gyms). Self-contained holiday accommodation opens. Wider social contact rules continue to apply in all settings: no indoor mixing between different households allowed.

May
Limit of 30 people allowed to gather outdoors. 'Rule of six' or two households allowed for indoor social gatherings. Indoor venues reopen, including pubs, restaurants, cinemas. Up to 10,000 spectators can attend the very largest outdoor-seated venues like football stadiums.

June
Restrictions on weddings and funerals abolished. Four-week delay in the release of further rules, as the government accelerates the vaccination programme.

July
Most legal limits on social contact removed in England, and the final closed sectors of the economy reopen.

Timeline of students responding at T4 (Cohort 2) according to COVID-19 restrictions
Total of students providing assent at T4 (Cohort 2): N = 3184

Students' adjustment to the changing circumstances resulting from the COVID-19 pandemic (i.e., lockdown and return to school) was measured using the following two items, which were designed for the purpose of the present study and considered independently: a) "For some people, the lockdown made their lives better and for others it has made it worse. How did lockdown affect your life?"; and b) "For some people, going back to school after lockdown has made their lives better and for others it has made it worse. How has the return to school after lockdown affected your life?". Both (single-item) questions were answered on a 5-point Likert-type scale that ranged from 1 ("life was much worse") to 5 ("life was much better").

Explanatory factors
Student factors (measured at T3) included age (in years), gender identity (male, female, other/prefer not to say), and self-classified ethnicity (using the following options defined by the research team: White, Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups). Because of the small numbers observed in our sample, and to facilitate data analyses, we recoded this variable as 'White' and 'other ethnic groups' (which included Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups). We categorized students' ethnicity in this way because 'White' vs 'other ethnic groups' can be significantly related to mental health outcomes in adolescence.⁶ We measured the student's year group across the home nations as follows:

The student-rated school climate (measured at T3) was assessed using the "Alaska School Climate and Connectedness Survey" (SCCS).⁷ The SCCS student version measures aspects of school climate (e.g., social and environmental factors that contribute to the subjective experience of a school) and connectedness (e.g., perceptions and feelings about the people at school) for students, asking them to consider the way the school is "most of the time". The SCCS scale includes 40 items (7 subscales), all with 5-point Likert responses (from 1 = "strongly agree" to 5 = "strongly disagree"). For the current study, 21 questions from the original SCCS questionnaire were employed, including the 'School Leadership and Student Involvement' (e.g., "At school, decisions are made based on what is best for students"), 'Respectful Climate' (e.g., "My teachers treat me with respect"), 'Peer Climate' (e.g., "Students in this school help each other, even if they are not friends"), and 'Caring Adults' (e.g., "There is at least one adult at this school whom I feel comfortable talking to about things that are bothering me") subscales. We used total scores, which were calculated by summing the corresponding subscale scores and dividing by the number of items, with higher scores representing a better school climate (range: 1-5). The internal consistency of the student-rated school climate variable (SCCS total score) in our study sample at T3 was α = 0.92.
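As an illustration of the internal consistency values reported throughout this supplement, the following minimal sketch computes Cronbach's alpha from item-level responses. It assumes that α denotes Cronbach's alpha and that complete data are available; the function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores per respondent.

    Assumes complete data (no missing responses) and at least two items.
    """
    k = len(items)
    if k < 2:
        raise ValueError("At least two items are needed.")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("All items must have the same number of respondents.")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    # Sample variance of each respondent's total score across items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: two items answered by four respondents.
toy_items = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha = cronbach_alpha(toy_items)
```

Values closer to 1 indicate that the items vary together, which is the sense in which the scale totals above are treated as internally consistent.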
The risk for mental health difficulties variable was defined by Latent Profile Analysis (LPA) reflecting subgroups of children with particular baseline (T0) patterns of risk for mental health difficulties based on student characteristics (age, gender, ethnicity, social-emotional-behavioral difficulties, risk for depression, and well-being), the school's broader context (school urbanity), school community (school deprivation), and school operational features (school social-emotional learning (SEL) ethos).⁶ LPA was developed in three steps and was conducted using maximum likelihood estimation with cluster (students within schools) robust standard errors. We were interested in classes that are optimally separated and are more likely to reflect 'true' classes in the population, rather than in the full spectrum of heterogeneity. Therefore, we evaluated a series of LPA models containing one to eight latent profiles in a randomly selected sub-sample (split-half). To validate the structure of the selected latent profile model, we tested LPA models in the second half of the sample, and all subsequent analyses were then developed with the total sample. For model selection, we used the Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Bayesian information criterion (BIC), sample-size-adjusted BIC (sBIC), Lo-Mendell-Rubin adjusted likelihood ratio test (LMR-LRT), as well as bootstrapped likelihood ratio test (BLRT), and we also calculated the index of classification accuracy (entropy).⁸ All models were well identified, and more latent profiles resulted in lower values of these fit indices (AIC, CAIC, BIC, sBIC), and hence suggested a better model fit. However, the LMR-LRT identified only three profiles, and the best entropy value was obtained with two profiles. In addition, the elbow plot showed the steepest slope with only two profiles. Given all this information, we compared the two- and three-profile models for conceptual interpretability and clarity.
The three-profile model replicated the higher risk profile of the two-profile model, and the lower risk profile was split into two non-risk profiles. For a better balance between fit and parsimony, and to aid interpretation, we chose the two-profile model. To validate the structure of the selected two-profile model, we repeated the process with the validation sample, which replicated our findings and supported the validity of our two-profile model. Following confirmation of a two-profile model structure from the two independent split-half samples, the dataset was recombined, and the same method of LPA was applied to the full sample. This allowed us to estimate the latent profile measurement model, generating weights that reflect individual profile membership, as well as the measurement error of the latent profile variable. Then, the latent profile variable was used as a factor in the subsequent auxiliary model (i.e., mixed effects linear regression).
The largest subgroup of students (72.8%) was mainly characterized by lower values of risk for depression and social-emotional-behavioral difficulties, as well as higher values of mental well-being. Students in this subgroup were also younger, more often identified as male and as being from ethnic backgrounds other than White, attended schools with a higher SEL ethos, and were more often from rural areas. Students in this subgroup were much less likely to be at risk of suffering from mental health difficulties, and thus this subgroup was labelled as "low risk". In contrast, the other subgroup of students (27.2%) had higher values of risk for depression and social-emotional-behavioral difficulties, as well as lower values of mental well-being, and were older, more often identified as female and White, attended schools with a lower SEL ethos, and were more often from urban areas. Students in this subgroup were more likely to be at risk of suffering from mental health difficulties, and thus this subgroup was labelled as "high risk". The mean values of the "low risk" subgroup were in the low category of risk for depression and social-emotional-behavioral difficulties, and in the medium category of mental well-being, while the mean values of the "high risk" subgroup were in the at-risk category of risk for depression and social-emotional-behavioral difficulties, and in the low category of mental well-being. We assigned students into their most likely profile based on BCH weights, using variables that reflected the measurement error of the latent profile variable.⁹ This two-subgroup model, used to assign students into latent profiles, was characterized by high posterior probabilities for all latent profiles across both the total sample and the randomly selected subsamples, suggesting low classification error. For more details on this variable see Montero-Marin et al. (2022).¹⁰
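The information criteria and the entropy index used for profile selection above follow standard textbook definitions, which can be sketched as below. This is an illustration only: the function names and the input values are hypothetical, the formulas are the commonly used ones rather than output from the software used in the study, and the likelihood ratio tests (LMR-LRT, BLRT) are not covered.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Standard AIC, CAIC, BIC and sample-size-adjusted BIC (lower is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
        "BIC": deviance + n_params * math.log(n_obs),
        "sBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


def relative_entropy(posteriors):
    """Classification accuracy index in [0, 1]; 1 means perfectly separated profiles.

    `posteriors` holds one row per student with one posterior probability per profile.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("Entropy is only defined for two or more profiles.")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical values for a single candidate model.
criteria = information_criteria(log_likelihood=-100.0, n_params=5, n_obs=200)
entropy = relative_entropy([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]])
```

In practice these quantities would be computed for each candidate model (here, one to eight profiles) and compared side by side, alongside interpretability.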

Explanatory factors
Student characteristics of the home environment during lockdown (T4) included household assets, studying conditions, home connectedness and home conflicts.
The internal consistency of the total scale (at T4) was α = 0.62.
Studying conditions at home were assessed using six items (with a "yes" = 1 / "no" = 0 response) that asked about having a quiet space, desk, computer, internet access, regular help from the teacher, and help from a parent/carer. Specifically, students were asked: "During lockdown did you have adequate access to the following support / resources at home? …A quiet space for working or studying, …A desk, …A laptop, tablet, or computer you can work on, …Good internet access, …Regular help from your teacher, school, or college, …Help from a parent or carer". Responses were summed and divided by the number of questions to calculate a total score that ranged from 0 to 1, with higher scores reflecting better studying conditions at home. The internal consistency of the total score of the studying conditions scale (at T4) was α = 0.67.
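The scoring rule described above (sum of six yes/no items divided by the number of items) can be sketched as follows. The item keys are hypothetical labels chosen for illustration, and the sketch assumes complete responses, since the handling of missing items is not described here.

```python
# Hypothetical short labels for the six studying-conditions items.
STUDY_ITEMS = (
    "quiet_space",
    "desk",
    "computer",
    "internet",
    "teacher_help",
    "parent_help",
)


def studying_conditions_score(responses):
    """Return the 0-1 total score from a dict of item -> 1 ("yes") or 0 ("no")."""
    missing = [item for item in STUDY_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"Missing items: {missing}")
    if any(responses[item] not in (0, 1) for item in STUDY_ITEMS):
        raise ValueError("Responses must be coded 1 (yes) or 0 (no).")
    return sum(responses[item] for item in STUDY_ITEMS) / len(STUDY_ITEMS)


# Hypothetical student with a desk, a computer and internet access only.
example = {
    "quiet_space": 0,
    "desk": 1,
    "computer": 1,
    "internet": 1,
    "teacher_help": 0,
    "parent_help": 0,
}
score = studying_conditions_score(example)
```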
Home connectedness was measured using the "Family Connectedness Scale".¹² This questionnaire consists of six items that were re-scaled into a range from 0 to 1, and then summed and divided by the number of items to calculate a total score, with higher scores reflecting better home connectedness. The items that form this scale are the following: "Could you talk to a parent/caregiver about problems you were having?" ("Yes" = 1 / "No" = 0), "How much did you feel your household cared about you?" (from "Not at all" = 1, to "Very much" = 7), "How much did you feel your household cared about your feelings?" (from "Not at all" = 1, to "Very much" = 7), "How much did you feel your household understood you?" (from "Not at all" = 1, to "Very much" = 7), "How much did you feel your household had lots of fun together?" (from "Not at all" = 1, to "Very much" = 7), "How much did you feel your household respected your privacy?" (from "Not at all" = 1, to "Very much" = 7). The internal consistency of the home connectedness scale (at T4) was α = 0.92.
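A minimal sketch of this scoring is given below. It assumes that the five 7-point items were re-scaled linearly (min-max) to the 0-1 range, which is one plausible reading of "re-scaled" and is not confirmed by the text; the function names are hypothetical.

```python
def rescale_unit(value, low, high):
    """Linearly map a response from [low, high] onto [0, 1] (assumed rescaling)."""
    if not low <= value <= high:
        raise ValueError(f"Value {value} outside the range {low}-{high}.")
    return (value - low) / (high - low)


def home_connectedness_score(could_talk, seven_point_items):
    """Mean of six 0-1 items: one yes/no item plus five 7-point items.

    could_talk: 1 ("Yes") or 0 ("No").
    seven_point_items: five responses from 1 ("Not at all") to 7 ("Very much").
    """
    if could_talk not in (0, 1):
        raise ValueError("could_talk must be 0 or 1.")
    if len(seven_point_items) != 5:
        raise ValueError("Expected five 7-point items.")
    rescaled = [float(could_talk)]
    rescaled += [rescale_unit(v, 1, 7) for v in seven_point_items]
    return sum(rescaled) / len(rescaled)


# Hypothetical student who could talk to a caregiver and answered 4 on every 7-point item.
score = home_connectedness_score(1, [4, 4, 4, 4, 4])
```

Under this assumption, a mid-scale response of 4 maps to 0.5, so the example student scores (1 + 5 × 0.5) / 6.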
Home conflicts were assessed using the following one-item question: "When parents or other adults in the house got into arguments with each other, others may have seen or heard what is going on. Did you see or hear one of these arguments?", which included the following response options: "yes, lots of times", "yes, sometimes", "yes, but rarely", "no/don't know".

Explanatory factors
Friendships were measured at T4 using the following one-item question: "During lockdown did you have at least one friend who you could turn to for support?", which included the following response options: "yes", "don't know", "no", "prefer not to say".
In order to capture the potential uncertainties around friendships during lockdown, when students were no longer able to see their peers at school, we included all response options as separate categories in the analyses.

School-level characteristics
School-level characteristics refer to the school community, operational features of the school, and broader school context.⁶ Data were obtained by linking publicly available governmental data to the school's postcode, unless otherwise specified. We selected measures that were directly comparable across all four nations within the UK (England, Northern Ireland, Scotland, and Wales). Otherwise, we mapped existing measures onto their English equivalent (e.g., school quality ratings). We used pre-pandemic measures of school community characteristics, operational features of the school, and broader school context in our study because we were mainly interested in the potential longitudinal relationships between pre-pandemic school-level characteristics and students' mental health difficulties and mental well-being over time. In other words, we wanted to evaluate how differences in pre-existing school-level characteristics could be associated with longitudinal change in our outcomes.

Explanatory factors: Characteristics of the school community
School community factors refer to characteristics of the student population at T0, including school deprivation (i.e., % of students eligible for free school meals), the percentage of students receiving support for special educational needs or disabilities (SEND), and the percentage of students self-classified as White (all range from 0% to 100%). In England, children living in households on income-related benefits (such as universal credit) are eligible for free school meals, as long as their annual household income does not exceed £7,400 after tax, not including welfare payments. This is the same in Wales and Scotland; however, in Northern Ireland it is set at £14,000 a year.

Explanatory factors: Operational features of the school
Operational features of the school at T0 included the total number of students within a school, student-to-teacher ratio, and coeducation (coeducational school or female-only school). The most recent official school inspection rating (Ofsted) at baseline was used to obtain an ordinal rating of school quality. As the approach to the measurement of school quality differed in public (independent schools) and private schools and across the nations, we mapped all school inspection rating systems onto the following categories: "requires improvements" = 0; "good" = 1; "outstanding" = 2.¹³

Quality of SEL provision was assessed through a semi-structured interview with the senior leadership team or a staff member with overall responsibility for teaching SEL, using a list of 16 quality indicators, specifically designed for the original trial.¹⁰ SEL in England is taught as part of 'Personal, Social, Health and Economic Education' (PSHE) lessons. Because delivering PSHE lessons in schools is not mandatory in England, there is wide variation across schools in the delivery of PSHE lessons (in terms of content covered and teaching time allocated). For inclusion in the study, schools had to meet 5 criteria for their current PSHE provision: regular, discrete, named teaching time for PSHE (or equivalent); a designated PSHE lead; a named member of the Senior Leadership Team (SLT) responsible for PSHE; documentation denoting clear strategic planning of SEL within the school; and evaluation of pupil progress in PSHE. Once a school was participating, PSHE was assessed by discussing PSHE provision with the teacher responsible for PSHE at each school (or a member of the Senior Leadership Team). Sixteen quality indicators (see below) were used to assess PSHE provision. They were created specifically for this trial and identified through a review of existing measures and via expert consultation.¹⁴ Schools were assigned a total score (range: 0-16; a higher school rating indicates better SEL provision) reflecting the number of quality indicators present (in the following domains: "Leadership and Strategic Approaches to PSHE", "Curriculum Content and Delivery" and "Assessment, Evaluation, and Consultation"). The items used, organised by their corresponding domain, were the following:

Leadership and Strategic Approaches to PSHE (from Consensus Indicators)
• A designated PSHE lead (0 = no, 1 = yes)
• A named member of SLT has responsibility for supporting PSHE (0 = no, 1 = yes)
• A written PSHE policy (0 = no, 1 = yes)
• School's own rating of the quality of its PSHE provision (0-4 = 0, 5-10 = 1)
• PSHE provision is part of the school improvement plan (0 = no, 1 = yes)

School SEL ethos (i.e., the underlying values and attitudes the school represents in relation to the way staff and students relate, the development of bonds between youth and adults, and the opportunities for participation in positive social activities)¹⁵ was estimated by a new measure that evaluated the school's commitment to and progress towards mental health and well-being. This measure was developed by gathering existing data from various relevant sources at T0, identifying all those variables that map onto the hypothesized latent construct of school SEL ethos in relation to promoting students' social, emotional and mental well-being. The following school-level measures were considered: official school quality ratings (i.e., Ofsted),¹³ average teacher-rated school climate (i.e., a school ecology total score measure aggregated from averaged teacher ratings based on the teacher version of the "School Climate and Connectedness Survey" that included the sub-scales of "School Leadership and Involvement", "Staff Attitudes" and "Respectful Climate"),⁶ an assessment of PSHE provision (i.e., quality of SEL provision), and the school commitment to teaching SEL, rated by an independent evaluator and based on the direct observation of the school.¹⁰ All these measures were rescaled to a new range from 0 to 4 points to ensure that all the variables contributed equally to the computation of the final index. After this, Pearson's r correlations were calculated (range: from 0.22 to 0.58). Optimal implementation of parallel analysis was used as a dimensionality test to decide on the number of factors to be retained. The number of random correlation matrices used was 500, and the generation of random correlation matrices was based on the permutation of sample values. The advised number of dimensions was 1 when the mean of random percentage of variance was considered, which explained a total of 65% of real-data variance. The robust unweighted least squares (RULS) method, correcting for robust mean and variance adjusted chi-squared statistic, was employed for factor extraction, using the correlation matrix as data entry.
The one-dimensional structure produced standardized loadings between 0.54 and 0.67. The factor determinacy index had a value of 0.85 and marginal reliability showed a value of 0.72. Construct replicability obtained a value of H = 0.72. The omega composite reliability for the unidimensional factor also obtained a value of 0.72. Factor scores were calculated by means of Bayes expected a posteriori (EAP) estimates transformed to T-scores, which ranged from 0 to 100, where higher scores represent a more conducive school ethos towards the promotion of social, emotional, and mental well-being.

School attainment was obtained from publicly available governmental data, referring to the average attainment of students within a school.¹⁶⁻¹⁸

The SCCS was used to assess teacher-rated school climate.⁶ A total score formed by the "School Leadership and Involvement" (e.g., "At school, decisions are made based on what is best for students"), "Staff Attitudes" (e.g., "Teachers and school staff believe that all students can do good work"), and "Respectful Climate" (e.g., "At this school, students and teachers get along really well") sub-scales was used, with higher scores representing a better school climate (range: 1-5). The internal consistency of the teacher-rated SCCS in our study sample at T3 was α = 0.92. Total scores were calculated by taking the mean across teachers within a school to obtain a school-level measure of teacher-rated school climate. In addition, a measure of student-rated school climate at the school level (i.e., school-level student-rated school climate) was also calculated following the same procedure (range: 1-5).

The time in school during the third lockdown was measured using the following one-item question: "During the lockdown, did you…" ("stayed at home?", "attended school some of the time?", "still attended school full time?").

Explanatory factors: Broader school context
The broader school context summarises wider socioeconomic factors in the school's catchment area at T0 and includes the following variables: urbanicity (urban vs. rural school location) and area-level deprivation (Index of Multiple Deprivation 2015; IMD, decile rating (from "most deprived" = 1 to "least deprived" = 10)), which summarises deprivation across the categories of income, employment, health/disability, education/skills/training, crime, barriers to services/housing and living environment.

The sample size and power calculation were originally determined by the objectives of the main intervention trial (see protocol and update).²¹,²² For the present study, we carried out a post hoc power calculation to assess whether changes in students' mental health difficulties (i.e., risk for depression and social-emotional-behavioral difficulties) and mental well-being from T3 to T4 differed by cohort status (i.e., Cohort 1 vs. Cohort 2). We used the following (observed) parameters for the calculation:
• Type I error: 0.05
• Test: Hotelling-Lawley Trace approach (this approach supports the inclusion of baseline covariates and uses the Wald test for the general linear mixed-effects model)
Under all these conditions, the statistical power obtained was 0.90, which means that our sample is powered to detect small effects (observed Hedges' g ranging from 0.12 to 0.22 in absolute value) for the main study aim.
Maintaining these conditions but reducing the sample size to 54 schools (in Cohort 2, for the secondary aim) provides a statistical power of 0.87. This means that our sample was still adequately powered to detect small effects in the univariable regression analyses. As the multivariable regression analyses included factors that were significant in the univariable analyses, we assume that the statistical power for the multivariable analyses would, if anything, be higher. By including those factors, we are able to capture more potential sources of variation, which reduces residual variance and can increase the model's ability to detect significant relationships.²⁴

Univariable analyses using multilevel linear regressions via Maximum Likelihood estimation and three-level mixed effects models for the analysis of the associations between student- and school-level characteristics and changes in adolescents' mental health and well-being between T3 and T4 in Cohort 1 (pre-pandemic). The first step includes the univariable analyses (e.g., age), and the second step includes the univariable analyses + the corresponding two-way interaction (e.g., Time*Age). LRT: Likelihood-ratio test comparing Step 1 vs Step 2. The continuous student-level factors (age, school climate) were group mean (school-level) centred, and therefore the regression coefficients represent an estimate of the differences in individual effects within schools. The continuous school-level factors were introduced as group means, so that these regression coefficients represent school-level effects (i.e., differences between schools).²⁹ Regression coefficients of the interaction terms reflect changes relative to the first assessment (i.e., T4 vs. T3). All models controlled for design variables, trial arm allocation, and the time difference (days) between T3 and T4.

Univariable analyses using multilevel linear regressions via Maximum Likelihood estimation and three-level mixed effects models for the analysis of the associations between student- and school-level characteristics and changes in adolescents' mental health and well-being between T3 and T4 in Cohort 2 (pandemic). The first step includes the univariable analyses (e.g., age), and the second step includes the univariable analyses + the corresponding two-way interaction (e.g., Time*Age). LRT: Likelihood-ratio test comparing Step 1 vs Step 2. The continuous student-level factors (age, school climate) were group mean (school-level) centred, and therefore the regression coefficients represent an estimate of the differences in individual effects within schools. The continuous school-level factors were introduced as group means, so that these regression coefficients represent school-level effects (i.e., differences between schools).²⁹ Regression coefficients of the interaction terms reflect changes relative to the first assessment (i.e., T4 vs. T3). All models controlled for design variables, trial arm allocation, and the time difference (days) between T3 and T4. The 'other ethnic groups' category includes Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups. CES-D: Center for Epidemiologic Studies Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40).

Multivariable analyses, using multilevel linear regressions via Maximum Likelihood estimation and three-level mixed effects models for the analysis of the unique associations between student- and school-level characteristics and changes in adolescents' mental health and well-being between T3 and T4 in Cohort 1 (pre-pandemic), entering those factors that provided significant p-values (p < 0.05) in the previous univariable analyses (eTable 4). The continuous student-level factors (school climate) were group mean (school-level) centred, and therefore the regression coefficients represent an estimate of the differences in individual effects within schools. The continuous school-level factors were introduced as group means, so that these regression coefficients represent school-level effects (i.e., differences between schools).²⁹ Regression coefficients of the interaction terms reflect changes relative to the first assessment (i.e., T4 vs. T3). All models controlled for design variables, trial arm allocation, and the time difference (days) between T3 and T4. CES-D: Center for Epidemiologic Studies Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40). WEMWBS: Warwick-Edinburgh Mental Well-Being Scale (range: 14-70). † This relationship was no longer significant when the Benjamini-Hochberg correction was applied to correct for multiple testing. NA: Not applicable given the absence of significant results in the univariable analysis. Those variables that did not show significant univariable associations in any of the outcome variables are omitted.

In the pre-pandemic Cohort 1, a more positive student-rated school climate (at T3) was related to a decreasing risk for depression and social-emotional-behavioral difficulties, as well as improved mental well-being, but the association between school climate and risk for depression and social-emotional-behavioral difficulties weakened over time (B=1.30, 95% CI=0.30 to 2.30; and B=0.66, 95% CI=0.16 to 1.15, respectively) (see eTable 6).

3.1: Risk for Depression: Center for Epidemiologic Studies Depression Scale (CES-D)
Risk for depression scores are predictive margins (eTable 6).

3.2: Social-emotional-behavioral difficulties: Strengths and Difficulties Questionnaire (SDQ)
Social-emotional-behavioral difficulties scores are predictive margins (eTable 6).

In the pre-pandemic Cohort 1, attending a coeducational school was associated with smaller decreases in mental well-being between T3 and T4 (B=2.44, 95% CI=0.57 to 4.31) compared with attending a female-only school, where students experienced larger decreases in mental well-being between T3 and T4 (see eTable 6).

Mixed-effects linear regressions with maximum likelihood (ML) estimation, including schools (clusters) as random effects and adjusted for the country, school size, coeducation, allocation group and the time difference (days) between T3 and T4. Descriptives are raw data. M (SD): mean (standard deviation). AMD: adjusted mean difference. 95% CI: 95% confidence interval. P: p-value. D: Cohen's d effect size. CES-D: Center for Epidemiologic Studies Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40). WEMWBS: Warwick-Edinburgh Mental Well-being Scale (range: 14-70). Significant differences remained significant when the Benjamini-Hochberg correction was applied to correct for multiple testing. See the eMethods for a description of how the initial risk of mental health difficulties status was estimated.

In the pre-pandemic Cohort 1, those with a low initial risk for mental health difficulties showed a deterioration in all outcomes over time. This deterioration was greater for those with a low initial risk compared to those with a high initial risk for mental health difficulties, for the outcomes risk for depression (B=2.01, 95% CI=0.37 to 3.64) and mental well-being (B=-2.13, 95% CI=-3.56 to -0.70) (see eTable 6). In the pandemic-exposed Cohort 2, both those with low and high initial risk for mental health difficulties showed deteriorations in outcomes. Deteriorations were greater in those with low (vs. high) initial risk for mental health difficulties (risk for depression: B=2.07, 95% CI=0.82 to 3.32; social-emotional-behavioral difficulties: B=0.99, 95% CI=0.41 to 1.57) (see Table 3 (main manuscript)).

8.1: Risk for Depression: Center for Epidemiologic Studies Depression Scale (CES-D)
Risk for depression scores (possible range: 0 to 60) are predictive margins (Table 3).

9.1: Risk for Depression: Center for Epidemiologic Studies Depression Scale (CES-D)
Risk for depression scores (possible range: 0 to 60) are predictive margins (Table 3).

9.2: Social-emotional-behavioral difficulties: Strengths and Difficulties Questionnaire (SDQ)
Social-emotional-behavioral difficulties scores (possible range: 0 to 40) are predictive margins (Table 3).

We tested the 'student-rated school climate x home connectedness x time' 3-way interaction. This 3-way interaction was significant in the univariable analyses for risk of depression and social-emotional-behavioral difficulties, but not for well-being. The 3-way interaction term was then included in the previously estimated multivariable models for risk of depression and social-emotional-behavioral difficulties (see eTable 9).

Multivariable analyses, using multilevel linear regressions via Maximum Likelihood estimation and three-level mixed effects models for the analysis of the unique associations between student- and school-level characteristics and changes in adolescents' mental health and well-being between T3 and T4 in Cohort 2 (pandemic), entering those factors that provided significant p-values (p < 0.05) in the previous univariable analyses (eTable 5), and including the "Time*Student-rated school climate*Home connectedness" three-way interaction.
The continuous student-level factors (age, school climate) were group mean (school-level) centred, and therefore the regression coefficients represent an estimate of the differences in individual effects within schools. The continuous school-level factors were introduced as group means, so that these regression coefficients represent school-level effects (i.e., differences between schools).²⁹ Regression coefficients of the interaction terms reflect changes relative to the first assessment (i.e., T4 vs. T3). All models controlled for design variables, trial arm allocation, and the time difference (days) between T3 and T4. The 'other ethnic groups' category includes Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups. CES-D: Center for Epidemiologic Studies Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40). † This relationship was no longer significant when the correction for multiple testing was applied. NA: Not applicable given the absence of significant results in the univariable analysis. Those variables that did not show significant univariable associations in any of the outcome variables (i.e., CES-D and SDQ) are omitted. The broader context was not included because the corresponding variables were not applicable.
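The group-mean (school-level) centring described in these notes can be sketched as follows: each student's value is expressed as a deviation from the mean of their own school, and the school means are kept separately as the between-school component. The function name and the toy data are hypothetical, and this illustrates only the centring step, not the mixed-effects models themselves.

```python
from collections import defaultdict


def group_mean_centre(values, groups):
    """Return (centred values, group means) for student values nested in schools."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length.")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    # Within-school component: deviation of each student from their school mean.
    centred = [value - means[group] for value, group in zip(values, groups)]
    return centred, means


# Hypothetical school climate scores for five students in two schools.
climate = [3.0, 5.0, 2.0, 4.0, 6.0]
school = ["A", "A", "B", "B", "B"]
within, between = group_mean_centre(climate, school)
```

After centring, the within-school deviations sum to zero in every school, so a coefficient on the centred variable reflects differences among students of the same school, while a coefficient on the school means reflects differences between schools.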

eFigure 10. Relationship between student-rated school climate and risk for depression as a function of home connectedness
To facilitate interpretation, we modeled the student-rated school climate and family connectedness data using three categories (M ± 1 SD).
CES-D: Center for Epidemiological Studies for Depression Scale. SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score). WEMWBS: Warwick-Edinburgh Mental Well-being Scale. The 'other ethnic groups' category includes Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups. Lockdown: adjustment to lockdown. Return: adjustment to school return. Models for 'Lockdown' and 'School' do not include the time interaction, as these outcomes were only measured at T4. Cells printed in white represent factors that are not included in the multivariable analyses. Cells printed in red show the varying p-values (the stronger the colour, the lower the p-value). ***p < 0.001, **p < 0.01, *p < 0.05, after correcting for multiple comparisons.

eFigure 3. Relationship between student-rated school climate and outcomes as a function of time (Cohort 1)

eFigure 4. Relationship between coeducation and outcomes as a function of time (Cohort 1)

eTable 7. Descriptive data and within-cohort outcome analyses by initial risk for mental health difficulties

eFigure 5. Outcomes by initial risk for mental health difficulties, cohort status, and time point

5.1: Risk for depression. Panels: a) Low risk for mental health difficulties (raw scores); b) ... risk for mental health difficulties (raw scores). Risk for depression: Center for Epidemiological Studies for Depression Scale (CES-D).

5.2: Social-emotional-behavioral difficulties. Panels: a) Low risk for mental health difficulties (raw scores); b) ... risk for mental health difficulties (raw scores). Social-emotional-behavioral difficulties: Strengths and Difficulties Questionnaire (SDQ).

5.3: Mental well-being. Panels: a) Low risk for mental health difficulties (raw scores); b) ... risk for mental health difficulties (raw scores). Mental well-being: Warwick-Edinburgh Mental Well-being Scale (WEMWBS).

eFigure 8. Relationship between student-rated school climate and outcomes as a function of time (Cohort 2)

eFigure 9. Relationship between friendship and outcomes as a function of time (Cohort 2)

eTable 1. Missing data and post hoc power calculation
The initial study sample at T3 consisted of K=12 schools and N=864 students in Cohort 1, and K=72 schools and N=6386 students in Cohort 2. Of those, 12 schools and 769 students (89.0%) in Cohort 1 (pre-pandemic), and 54 schools and 2958 students (46.3%) in Cohort 2 (mid-pandemic), were retained until T4 and provided data on at least one outcome. Therefore, we observed a missing data rate of 11.0% of students in Cohort 1 and 53.7% of students in Cohort 2 (25.0% of schools in Cohort 2), with an overall non-response of 48.6% of students. The specific attrition numbers for each outcome can be seen in the footnote of the next table below (Selected characteristics of pupils included at T3 by T4 follow-up status and cohort). This table shows the differences in student characteristics between students retained and lost to follow-up at T4. As can be seen, students who were retained (vs lost to follow-up) reported marginally fewer mental health difficulties and greater well-being at T3, particularly in Cohort 1 compared to Cohort 2.
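The retention and missing-data percentages reported above follow directly from the stated counts. The short sketch below reproduces that arithmetic as a check; the variable and function names are illustrative and are not taken from the study materials.

```python
# Counts as reported in the text (students at T3 and students retained at T4).
initial_students = {"Cohort 1": 864, "Cohort 2": 6386}
retained_students = {"Cohort 1": 769, "Cohort 2": 2958}

# Schools in Cohort 2 at T3 and at T4, as reported in the text.
initial_schools_c2 = 72
retained_schools_c2 = 54


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of students retained in each cohort.
retention = {
    cohort: pct(retained_students[cohort], initial_students[cohort])
    for cohort in initial_students
}

# Share of students lost to follow-up in each cohort.
missing = {
    cohort: pct(
        initial_students[cohort] - retained_students[cohort],
        initial_students[cohort],
    )
    for cohort in initial_students
}

# Overall student non-response across both cohorts.
total_initial = sum(initial_students.values())
total_retained = sum(retained_students.values())
overall_missing = pct(total_initial - total_retained, total_initial)

# Share of Cohort 2 schools lost between T3 and T4.
schools_missing_c2 = pct(initial_schools_c2 - retained_schools_c2, initial_schools_c2)

print(retention, missing, overall_missing, schools_missing_c2)
```

The computed values agree with the percentages given in the text (89.0% and 46.3% retained; 11.0% and 53.7% missing; 48.6% overall; 25.0% of Cohort 2 schools).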
As this study has more than 40% missing data, we report the missing data patterns found and the results of the complete case analyses under the missing at random (MAR) assumption, recognizing that our analyses are exploratory.20 Our exploration of the possible missing data mechanisms in the data set (see student characteristics included at T3 by T4 follow-up status and cohort below) showed that, in general, students with higher levels of mental health difficulties and lower levels of mental well-being at T3 were more likely to have missing data. Therefore, our results could be biased towards a more positive view of participating students' initial mental health and mental well-being. This could have affected Cohort 1 and Cohort 2 differently, as they likely showed different missing data patterns, reflecting the different circumstances under which measurements were carried out. In terms of the main aim (i.e., longitudinal cohort comparison), this is a limitation of this study. However, in terms of the secondary aims (i.e., longitudinal relationships between factors and outcomes in the mid-pandemic cohort), finding effects in our more conservative, healthier retained sample would suggest that the results obtained are likely robust in the full Cohort 2 sample. To gauge how missingness might affect statistical power, we carried out a post hoc statistical power calculation (see below).
© 2023 Montero-Marin J et al. JAMA Netw Open.
* Defined as those pupils with missing data on all 3 primary outcomes at 2-year follow-up.
** Defined as those pupils with at least one of the 3 primary outcomes at 2-year follow-up.
† Sample size in lost to follow-up group: 3445 (Cohort 1: 95; Cohort 2: 3350). Sample size in remaining students' group: 3675 (Cohort 1: 768; Cohort 2: 2907).
†† Sample size in lost to follow-up group: 3429 (Cohort 1: 95; Cohort 2: 3334). Sample size in remaining students' group: 3670 (Cohort 1: 767; Cohort 2: 2903).
††† Sample size in lost to follow-up group: 3517 (Cohort 1: 95; Cohort 2: 3422). Sample size in remaining students' group: 3721 (Cohort 1: 767; Cohort 2: 2954).
†††† Sample size in lost to follow-up group: 3508 (Cohort 1: 94; Cohort 2: 3414). Sample size in remaining students' group: 3717 (Cohort 1: 766; Cohort 2: 2951).
††††† Sample size in lost to follow-up group: 3521 (Cohort 1: 95; Cohort 2: 3426). Sample size in remaining students' group: 3723 (Cohort 1: 769; Cohort 2: 2954).
CES-D: Center for Epidemiologic Studies for Depression Scale. SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score). WEMWBS: Warwick-Edinburgh Mental Well-Being Scale. School year groups correspond across the home nations as follows: England 9 & 10; Northern Ireland 10 & 11; Scotland S2 & S3.23

FSM: free school meals. OFSTED: Office for Standards in Education, Children's Services and Skills. a Kuyken et al. (2017).21 b Kuyken et al. (2022).25 c Data obtained using online publicly available data published by the education and statistics departments (e.g., Department of Education, 2020; https://gov.uk).26 All available data were collected according to their proximity to the year in which participating pupils provided baseline (T0) questionnaire data. d Briere et al. (2013).27 e https://www.sdqinfo.org/norms/UKNorm3.pdf.28 f Clarke et al. (2011).5

eTable 3. Descriptive data of students' home environment and adjustment during lockdown and return to school, added in Cohort 2 at T4
Variables
Still attended school full time: 240 (9.0)
How did lockdown affect you, M (SD), range: 1 (life was worse) to 5 (life was better): 2.95 (1.27)
Going back to school, M (SD), range: 1 (life was worse) to 5 (life was better): 2.96 (1.19)
A total of 2662 students provided data on 'household assets', 2636 students provided data on 'home connectedness', 2658 students provided data on 'home conflicts', 2127 students provided data on 'studying conditions', 2661 students provided data on 'how did lockdown affect you', 2662 students provided data on 'going back to school', 2661 students provided data on 'during lockdown', 2658 students provided data on 'friend during lockdown'.

eFigure 2. Students' transitions in terms of risk for depression, social-emotional-behavioral difficulties and mental well-being from T3 to T4 by cohort

Univariable analyses of risk for depression, social-emotional-behavioral difficulties, and well-being (Cohort 2)
The 'other ethnic groups' category includes Arab, Asian, Black/African/Caribbean, mixed/multiple ethnic groups, other ethnic groups. CES-D: Center for Epidemiologic Studies for Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40). WEMWBS: Warwick-Edinburgh Mental Well-Being Scale (range: 14-70). NA: not applicable given the absence of significant results in the LRT comparing Step 1 vs Step 2.

Multivariable analyses of risk for depression, social-emotional-behavioral difficulties, and well-being (Cohort 1)
WEMWBS: Warwick-Edinburgh Mental Well-Being Scale (range: 14-70). NA: not applicable given the absence of significant results in the LRT comparing Step 1 vs Step 2.
eTable 6.

Multivariable analyses for risk for depression and social-emotional-behavioural difficulties, incorporating the "Time*Student-rated school climate*Home connectedness" three-way interaction (Cohort 2)
LRT: Likelihood-ratio test comparing Model: Student-rated school climate (student level), Home connectedness, Time, 'Student-rated school climate (student level) x Time', 'Home connectedness x Time' vs Model: Student-rated school climate (student level), Home connectedness, Time, 'Student-rated school climate (student level) x Time', 'Home connectedness x Time', 'Student-rated school climate (student level) x Home connectedness', 'Student-rated school climate (student level) x Home connectedness x Time'. All models include the design variables, trial-arm status, and the time difference (days) between T3 and T4. CES-D: Center for Epidemiological Studies for Depression Scale (range: 0-60). SDQ: Strengths and Difficulties Questionnaire (Total Difficulties Score; range: 0-40). WEMWBS: Warwick-Edinburgh Mental Well-being Scale (range: 14-70).
eTable 9.
Evidence map of the associations found in the present study
Student-rated school climate (student-level)*Home connectedness
Time*Student-rated school climate (student-level)*Home connectedness *
Home conflicts sometimes (vs. lots of times)
Home conflicts rarely (vs. lots of times)
Home conflicts no/don't know (vs. lots of times)
Time*Home conflicts sometimes (vs. lots of times)
Time*Home conflicts rarely (vs. lots of times)
Time*Home conflicts no/don't know (vs. lots of times)
Friend don't know (vs. yes)
Friend prefer not to say (vs. yes)
Time*Friend don't know (vs. yes) *
Time*Friend no (vs. yes)
Time*Friend prefer not to say (vs. yes)
School some of the time (vs. at home) ***
School full time (vs. at home)
Time*School some of the time (vs. at home)
Time*School full time (vs. at home)
eFigure 11.